Task Scheduling in Big Data - Review, Research Challenges, and Prospects
Abstract
In Big Data computing, processing data requires large amounts of CPU cycles, network bandwidth, and disk I/O. Dataflow is a programming model for processing Big Data in which tasks are organized in a graph structure. Scheduling these tasks, that is, placing them on the available resources, is a key active research area. Tasks must be scheduled effectively, in a manner that minimizes task completion time and increases resource utilization. In recent years, researchers have discussed and presented different task scheduling algorithms. In this study, we investigate the state of the art of various task scheduling algorithms, scheduling considerations for batch and stream processing, and the task scheduling algorithms in well-known open-source Big Data platforms. Furthermore, this study proposes a new task scheduling system to alleviate the problems that persist in existing task scheduling for Big Data.
Keywords: Big Data, MapReduce, Dataflow, Task Scheduling Model, Twister2, Static and Dynamic Task Scheduling.
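As a rough illustration of the scheduling problem the abstract describes, the following minimal sketch places the tasks of a small dataflow graph on a fixed number of workers. It is not the scheduling system proposed in the paper: the greedy earliest-start heuristic, the function name `schedule`, the task names, and the run-time estimates are all assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule(tasks, deps, num_workers):
    """Greedy static scheduling of a task graph (illustrative only).

    tasks: dict mapping task name -> estimated run time
    deps:  dict mapping task name -> set of predecessor task names
    Returns (placement, makespan), where placement maps each task
    to a (worker index, start time, finish time) tuple.
    """
    graph = {t: deps.get(t, set()) for t in tasks}
    order = TopologicalSorter(graph).static_order()
    worker_free = [0.0] * num_workers  # time at which each worker becomes idle
    placement = {}
    for task in order:
        # A task may start only after all of its predecessors have finished.
        ready = max((placement[p][2] for p in graph[task]), default=0.0)
        # Pick the worker on which this task can start earliest.
        worker = min(range(num_workers),
                     key=lambda w: max(worker_free[w], ready))
        start = max(worker_free[worker], ready)
        finish = start + tasks[task]
        worker_free[worker] = finish
        placement[task] = (worker, start, finish)
    makespan = max((f for _, _, f in placement.values()), default=0.0)
    return placement, makespan


# Hypothetical MapReduce-style graph: one read, two parallel maps, one reduce.
tasks = {"read": 2.0, "map_a": 3.0, "map_b": 3.0, "reduce": 1.0}
deps = {"map_a": {"read"}, "map_b": {"read"}, "reduce": {"map_a", "map_b"}}

placement, makespan = schedule(tasks, deps, num_workers=2)
print(placement)
print("makespan:", makespan)
```

With two workers the two map tasks run in parallel, so the completion time is shorter than on a single worker, which is the trade-off between completion time and resource utilization that the abstract refers to.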
Similar resources
Task Scheduling in Fog Computing: A Survey
Recently, fog computing has been introduced to solve the challenges of cloud computing regarding Internet objects. One of the challenges in the field of fog computing is the scheduling of tasks requested by Internet objects. In this study, a review of articles related to task scheduling in fog computing has been done. At first, the research questions and goals will be introduced, an...
Integrated modeling and solving the resource allocation problem and task scheduling in the cloud computing environment
Cloud computing is considered to be a new service provider technology for users and businesses. However, the cloud environment is facing a number of challenges. Resource allocation in a way that is optimum for users and cloud providers is difficult because of lack of data sharing between them. On the other hand, job scheduling is a basic issue and at the same time a big challenge in reaching hi...
Small Hydro-Power Plants in Kenya: A Review of Status, Challenges and Future Prospects
Small Hydro-power Plants (SHP) are an important source of electricity in many countries. However, little is known about SHP in Kenya. This paper reviews the status, challenges in implementation of SHP and prospects for future development of SHP in Kenya. The paper shows that SHP has not yet fully utilized the available hydro-power potential. The challenges associated with SHP development should...
Big data preprocessing: methods and prospects
Massive growth in the scale of data has been observed in recent years and is a key factor of the Big Data scenario. Big Data can be defined as high volume, velocity and variety of data that require new high-performance processing. Addressing big data is a challenging and time-demanding task that requires a large computational infrastructure to ensure successful data processing and analysis...
A Review of Data Intensive Computing
Data intensive computing is a common research problem in science, industry and computer academia. In the past twenty years, explosive growth of scientific data has appeared all over the world. Typical data intensive computing applications include Internet text data processing, scientific research data processing, large scale graph computing, inverse and perspective problems. Data intensive compu...
Publication date: 2017